CHAPTER 6

Measures of Dispersion





Studying this chapter should enable you to:
• know the limitations of averages;
• appreciate the need for measures of dispersion;
• enumerate various measures of dispersion;
• calculate the measures and compare them;
• distinguish between absolute and relative measures.


1. INTRODUCTION 


In the previous chapter, you have 
studied how to sum up the data into a 
single representative value. However, 
that value does not reveal the variability 
present in the data. In this chapter you 
will study those measures, which seek 
to quantify variability of the data. 





Three friends, Ram, Rahim and 
Maria are chatting over a cup of tea. 
During the course of their conversation, 
they start talking about their family 
incomes. Ram tells them that there are 
four members in his family and the 
average income per member is Rs 
15,000. Rahim says that the average 
income is the same in his family, though 
the number of members is six. Maria 
says that there are five members in her 
family, out of which one is not working. 
She calculates that the average income 
in her family too, is Rs 15,000. They 
are a little surprised since they know 
that Maria’s father is earning a huge 
salary. They go into details and gather 
the following data: 
Family Incomes

Sl. No.            Ram       Rahim     Maria
1.                 12,000     7,000         0
2.                 14,000    10,000     7,000
3.                 16,000    14,000     8,000
4.                 18,000    17,000    10,000
5.                      -    20,000    50,000
6.                      -    22,000         -
Total income       60,000    90,000    75,000
Average income     15,000    15,000    15,000


Do you notice that although the 
average is the same, there are 
considerable differences in individual 
incomes? 

It is quite obvious that averages try 
to tell only one aspect of a distribution 
i.e. a representative size of the values. 
To understand it better, you need to 
know the spread of values also. 

You can see that in Ram’s family, 
differences in incomes are 
comparatively lower. In Rahim’s family, 
differences are higher and in Maria’s 
family, the differences are the highest. 
Knowledge of only average is 
insufficient. If you have another value 
which reflects the quantum of variation 
in values, your understanding of a
distribution improves considerably. 
For example, per capita income gives 
only the average income. A measure of 
dispersion can tell you about income 
inequalities, thereby improving the 
understanding of the relative standards 
of living enjoyed by different strata of 
society. 

Dispersion is the extent to which 
values in a distribution differ from the 
average of the distribution. 

To quantify the extent of the 
variation, there are certain measures 
namely: 

(i) Range 

(ii) Quartile Deviation 
(iii) Mean Deviation 

(iv) Standard Deviation 


Apart from these measures which 
give a numerical value, there is a 
graphic method for estimating 
dispersion. 

Range and quartile deviation 
measure the dispersion by calculating 
the spread within which the values lie. 
Mean deviation and standard deviation 
calculate the extent to which the values 
differ from the average. 


2. MEASURES BASED UPON SPREAD 
OF VALUES 


Range 


Range (R) is the difference between the 
largest (L) and the smallest value (S) in 
a distribution. Thus, 
R=L-S 

Higher value of range implies higher 
dispersion and vice-versa. 
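As a minimal sketch of R = L - S, the snippet below applies the definition to the three families from the Family Incomes table. The function name `value_range` is illustrative only and is not part of the chapter.

```python
def value_range(values):
    """Return R = L - S: the largest value minus the smallest value."""
    return max(values) - min(values)

# Incomes (Rs) of the members of each family, as listed in the table above.
ram = [12000, 14000, 16000, 18000]
rahim = [7000, 10000, 14000, 17000, 20000, 22000]
maria = [0, 7000, 8000, 10000, 50000]

for name, incomes in [("Ram", ram), ("Rahim", rahim), ("Maria", maria)]:
    print(name, value_range(incomes))
```

All three families have the same average income, yet the ranges differ widely, which is the point the story makes.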
Activities

Look at the following values:

20, 30, 40, 50, 200

• Calculate the Range.

• What is the Range if the value
200 is not present in the data
set?

• If 50 is replaced by 150, what
will be the Range?


Range: Comments 


Range is unduly affected by extreme 
values. It is not based on all the 
values. As long as the minimum 


and maximum values remain 
unaltered, any change in other 
values does not affect range. It 
cannot be calculated for open- 
ended frequency distribution. 





Notwithstanding some limitations, 
range is understood and used 
frequently because of its simplicity. For 
example, we see the maximum and 
minimum temperatures of different 
cities almost daily on our TV screens 
and form judgments about the 
temperature variations in them. 


Open-ended distributions are those 
in which either the lower limit of 


the lowest class or the upper limit 
of the highest class or both are not 
specified. 





Activity

• Collect data about 52-week high/
low of shares of 10 companies
from a newspaper. Calculate the
range of share prices. Which
company's share is most volatile
and which is the most stable?


STATISTICS FOR ECONOMICS 


Quartile Deviation 


The presence of even one extremely 
high or low value in a distribution can 
reduce the utility of range as a measure 
of dispersion. Thus, you may need a 
measure which is not unduly affected 
by the outliers. 

In such a situation, if the entire data 
is divided into four equal parts, each 
containing 25% of the values, we get 
the values of quartiles and median. 
(You have already read about these in 
Chapter 5). 

The upper and lower quartiles ($Q_3$
and $Q_1$, respectively) are used to
calculate the inter-quartile range, which is
$Q_3 - Q_1$.

The inter-quartile range is based upon
the middle 50% of the values in a
distribution and is, therefore, not
affected by extreme values. Half of the
inter-quartile range is called quartile
deviation (Q.D.). Thus:

$$Q.D. = \frac{Q_3 - Q_1}{2}$$


Q.D. is therefore also called Semi- 
Inter Quartile Range. 


Calculation of Range and Q.D. for 
ungrouped data 


Example 1 


Calculate range and Q.D. of the
following observations:
20, 25, 29, 30, 35, 39, 41,
48, 51, 60 and 70

Range is clearly 70 - 20 = 50.

For Q.D., we need to calculate the
values of $Q_1$ and $Q_3$.

$Q_1$ is the size of the $\frac{n+1}{4}$th value.
n being 11, $Q_1$ is the size of the 3rd value.

As the values are already arranged
in ascending order, it can be seen that
$Q_1$, the 3rd value, is 29. [What will you
do if these values are not in an order?]

Similarly, $Q_3$ is the size of the $\frac{3(n+1)}{4}$th
value; i.e. the 9th value, which is 51. Hence
$Q_3 = 51$.

$$Q.D. = \frac{Q_3 - Q_1}{2} = \frac{51 - 29}{2} = 11$$

Do you notice that Q.D. is the
average difference of the Quartiles from
the median?

Activity

• Calculate the median and
check whether the above
statement is correct.
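The positional rule used in Example 1 can be sketched in a few lines of Python. This is an illustration under a stated assumption, not a general quartile routine: it only handles the case where (n + 1) is divisible by 4, so that both positions are whole numbers. The name `quartile_deviation` is hypothetical.

```python
def quartile_deviation(values):
    """Return (Q1, Q3, Q.D.) using the (n + 1)/4 and 3(n + 1)/4 positions.

    Assumption: (n + 1) is divisible by 4, so no interpolation between
    neighbouring values is needed.
    """
    ordered = sorted(values)              # the values must be in ascending order
    n = len(ordered)
    if (n + 1) % 4 != 0:
        raise ValueError("this sketch needs (n + 1) divisible by 4")
    q1 = ordered[(n + 1) // 4 - 1]        # positions are 1-based, lists 0-based
    q3 = ordered[3 * (n + 1) // 4 - 1]
    return q1, q3, (q3 - q1) / 2

print(quartile_deviation([20, 25, 29, 30, 35, 39, 41, 48, 51, 60, 70]))
# (29, 51, 11.0)
```

Sorting inside the function also answers the bracketed question in the example: unordered values are first arranged in ascending order.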


Calculation of Range and Q.D. for a 
frequency distribution. 


Example 2 


For the following distribution of marks 
scored by a class of 40 students, 
calculate the Range and Q.D. 


TABLE 6.1

Class intervals    No. of students
CI                 (f)
0-10                5
10-20               8
20-40              16
40-60               7
60-90               4
                   40


Range is just the difference between
the upper limit of the highest class and
the lower limit of the lowest class. So
range is 90 - 0 = 90. For Q.D., first
calculate cumulative frequencies as
follows:


Class-         Frequencies    Cumulative
Intervals                     Frequencies
CI             f              c.f.
0-10            5              5
10-20           8             13
20-40          16             29
40-60           7             36
60-90           4             40
               n = 40


$Q_1$ is the size of the $\frac{n}{4}$th value in a
continuous series. Thus, it is the size
of the 10th value. The class containing
the 10th value is 10-20. Hence, $Q_1$ lies
in class 10-20. Now, to calculate the
exact value of $Q_1$, the following formula
is used:

$$Q_1 = L + \frac{\frac{n}{4} - c.f.}{f} \times i$$

Where L = 10 (lower limit of the 
relevant Quartile class) 

c.f. = 5 (Value of c.f. for the class 

preceding the quartile class) 

i = 10 (interval of the quartile class), 
and 

f = 8 (frequency of the quartile class) 
Thus, 








$$Q_1 = 10 + \frac{10 - 5}{8} \times 10 = 16.25$$


Similarly, $Q_3$ is the size of the $\frac{3n}{4}$th
value; i.e., the 30th value, which lies in
class 40-60. Now using the formula
for $Q_3$, its value can be calculated as
follows:

$$Q_3 = L + \frac{\frac{3n}{4} - c.f.}{f} \times i$$

$$Q_3 = 40 + \frac{30 - 29}{7} \times 20$$

$$Q_3 = 42.86$$

$$Q.D. = \frac{42.86 - 16.25}{2} \approx 13.31$$


In individual and discrete series,
$Q_1$ is the size of the $\frac{n+1}{4}$th value, but
in a continuous distribution, it is
the size of the $\frac{n}{4}$th value. Similarly,
for $Q_3$ and median also, n is used in
place of n+1.
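The interpolation formula for a continuous series can be sketched as a small Python function. The function name `grouped_quantile` and the representation of each class as a `(lower, upper, frequency)` tuple are assumptions made for this illustration.

```python
def grouped_quantile(classes, position):
    """Interpolate the value at a cumulative position in a grouped series.

    classes:  list of (lower_limit, upper_limit, frequency), ascending.
    position: n/4 for Q1, n/2 for the median, 3n/4 for Q3.
    """
    cf = 0  # cumulative frequency of the classes preceding the current one
    for lower, upper, f in classes:
        if cf + f >= position:
            # L + (position - c.f.) / f * i, with i the class interval
            return lower + (position - cf) / f * (upper - lower)
        cf += f
    raise ValueError("position exceeds the total frequency")

# Marks of 40 students from Table 6.1.
marks = [(0, 10, 5), (10, 20, 8), (20, 40, 16), (40, 60, 7), (60, 90, 4)]
n = sum(f for _, _, f in marks)

q1 = grouped_quantile(marks, n / 4)
q3 = grouped_quantile(marks, 3 * n / 4)
print(round(q1, 2), round(q3, 2), round((q3 - q1) / 2, 2))
```

Because the code keeps full precision until the final rounding, the last decimal place of Q.D. can differ slightly from a hand calculation that rounds the third quartile first.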





If the entire group is divided into 
two equal halves and the median 
calculated for each half, you will have 
the median of better students and the 
median of weak students. These 
medians differ from the median of the 
entire group by 13.31 on an average. 
Similarly, suppose you have data 
about incomes of people of a town. 
Median income of all people can be 
calculated. Now, if all people are 
divided into two equal groups of rich 
and poor, medians of both groups can 
be calculated. Quartile deviation will 
tell you the average difference between 
medians of these two groups belonging 
to rich and poor, from the median of 
the entire group. 




Quartile deviation can generally be 
calculated for open-ended 
distributions and is not unduly affected 
by extreme values. 


3. MEASURES OF DISPERSION FROM 
AVERAGE 


Recall that dispersion was defined as 
the extent to which values differ from 
their average. Range and quartile
deviation do not measure how far
the values lie from their
average. Yet, by calculating the spread
of values, they do give a good idea
about the dispersion. Two measures
which are based upon deviation of the 
values from their average are Mean 
Deviation and Standard Deviation. 

Since the average is a central value, 
some deviations are positive and some 
are negative. If these are added as they 
are, the sum will not reveal anything. 
In fact, the sum of deviations from 
Arithmetic Mean is always zero. Look 
at the following two sets of values. 


Set A: 5, 9, 16 
Set B: 1, 9, 20


You can see that values in Set B are 
farther from the average and hence 
more dispersed than values in Set A. 
Calculate the deviations from 
Arithmetic Mean and sum them up. 
What do you notice? Repeat the same 
with Median. Can you comment upon 
the quantum of variation from the 
calculated values? 

Mean Deviation tries to overcome 
this problem by ignoring the signs of 
deviations, i.e., it considers all
deviations positive. For standard 
deviation, the deviations are first 
squared and averaged and then square 
root of the average is found. We shall 
now discuss them separately in detail. 
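The cancellation of positive and negative deviations can be checked directly for Set A (5, 9, 16) and Set B (1, 9, 20). The helper name `deviation_sums` is hypothetical; the sketch only illustrates why signed deviations cannot be summed as they are.

```python
from statistics import mean

set_a = [5, 9, 16]
set_b = [1, 9, 20]

def deviation_sums(values):
    """Return (sum of signed deviations, sum of absolute deviations) from the mean."""
    m = mean(values)
    signed = sum(x - m for x in values)
    absolute = sum(abs(x - m) for x in values)
    return signed, absolute

print(deviation_sums(set_a))  # (0, 12)
print(deviation_sums(set_b))  # (0, 20)
```

The signed sums are zero for both sets, so they say nothing about spread. Once the signs are ignored, Set B shows the larger total deviation.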


Mean Deviation 


Suppose a college is proposed for 
students of five towns A, B, C, D and E
which lie in that order along a road. 
Distances of towns in kilometres from 
town A and number of students in 
these towns are given below: 


Town     Distance from town A (km)    No. of Students
A                    0                       90
B                    2                      150
C                    6                      100
D                   14                      200
E                   18                       80
Total                                       620


Now, if the college is situated in 
town A, 150 students from town B will 
have to travel 2 kilometers each (a total 
of 300 kilometres) to reach the college. 
The objective is to find a location so that 
the average distance travelled by 
students is minimum. 

You may observe that the students 
will have to travel more, on an average, 
if the college is situated at town A or E. 
If on the other hand, it is somewhere in 
the middle, they are likely to travel less. 
Mean deviation is the appropriate 
statistical tool to estimate the average 
distance travelled by students. Mean 
deviation is the arithmetic mean of the 
differences of the values from their 
average. The average used is either the
arithmetic mean or median.

(Since the mode is not a stable
average, it is not used to calculate mean
deviation.)

Activities

• Calculate the total distance to
be travelled by students if the
college is situated at town A, at
town C, or town E and also if it
is exactly half way between A
and E.

• Decide where, in your opinion,
the college should be established,
if there is only one
student in each town. Does it
change your answer?
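One way to explore the college-location problem is to total the distance travelled for any candidate location. The sketch below assumes each student travels the straight road distance between the home town and the college; the names `towns` and `total_distance` are illustrative.

```python
# Town -> (distance from town A in km, number of students), from the table above.
towns = {"A": (0, 90), "B": (2, 150), "C": (6, 100), "D": (14, 200), "E": (18, 80)}

def total_distance(location_km):
    """Total kilometres travelled by all students to a college at location_km."""
    return sum(students * abs(dist - location_km) for dist, students in towns.values())

total_students = sum(students for _, students in towns.values())

for label, km in [("A", 0), ("C", 6), ("E", 18), ("halfway between A and E", 9)]:
    total = total_distance(km)
    print(label, total, round(total / total_students, 2))
```

Dividing each total by the 620 students gives the average distance per student, which is the quantity the mean deviation measures.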


Calculation of Mean Deviation from 
Arithmetic Mean for ungrouped 
data. 


Direct Method 
Steps: 


(i) The A.M. of the values is calculated.

(ii) The difference between each value and
the A.M. is calculated. All differences
are considered positive. These are
denoted as |d|.

(iii) The A.M. of these differences (called
deviations) is the Mean Deviation,

i.e. $M.D. = \frac{\sum |d|}{n}$


Example 3 


Calculate the mean deviation of the 
following values; 2, 4, 7, 8 and 9. 


The A.M. $= \frac{\sum X}{n} = \frac{30}{5} = 6$


2020-21 




X      |d|
2       4
4       2
7       1
8       2
9       3
       12

$$M.D._{(\bar{x})} = \frac{\sum |d|}{n} = \frac{12}{5} = 2.4$$


Mean Deviation from median for 
ungrouped data. 


Method 


Using the values in Example 3, M.D. 

from the Median can be calculated as 

follows, 

(i) Calculate the median which is 7. 

(ii) Calculate the absolute deviations
from median, denote them as |d|.

(iii) Find the average of these absolute 
deviations. It is the Mean Deviation. 


Example 5

X      |d| = |X - Median|
2       5
4       3
7       0
8       1
9       2
       11

M.D. from Median is thus,

$$M.D._{(Median)} = \frac{\sum |d|}{n} = \frac{11}{5} = 2.2$$
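Both calculations for the values 2, 4, 7, 8 and 9 follow the same pattern and differ only in the average used. A minimal sketch, with the hypothetical helper `mean_deviation` taking the chosen average as a parameter:

```python
from statistics import mean, median

def mean_deviation(values, center):
    """Arithmetic mean of the absolute deviations of values from center."""
    return sum(abs(x - center) for x in values) / len(values)

values = [2, 4, 7, 8, 9]
md_from_mean = mean_deviation(values, mean(values))      # deviations from 6
md_from_median = mean_deviation(values, median(values))  # deviations from 7
print(md_from_mean, md_from_median)  # 2.4 2.2
```

The result from the median is the smaller of the two here.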
Mean Deviation from Mean for
Continuous Distribution

TABLE 6.2

Profits of companies (Rs in lakh)    Number of
Class intervals                      Companies
10-20                                     5
20-30                                     8
30-50                                    16
50-70                                     8
70-80                                     3
                                         40


Steps: 


(i) Calculate the mean of the
distribution.

(ii) Calculate the absolute deviations
|d| of the class midpoints from the
mean.

(iii) Multiply each |d| value with its
corresponding frequency to get f|d|
values. Sum them up to get $\sum f|d|$.

(iv) Apply the following formula:

$$M.D._{(\bar{x})} = \frac{\sum f|d|}{\sum f}$$

Mean Deviation of the distribution 
in Table 6.2 can be calculated as 
follows: 


Example 6

C.I.      f     m.p.    |d|     f|d|
10-20      5    15      25.5    127.5
20-30      8    25      15.5    124.0
30-50     16    40       0.5      8.0
50-70      8    60      19.5    156.0
70-80      3    75      34.5    103.5
          40                    519.0

$$M.D._{(\bar{x})} = \frac{\sum f|d|}{\sum f} = \frac{519}{40} = 12.975$$
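Steps (i) to (iv) can be sketched for the distribution in Table 6.2 as follows. The tuple layout `(lower, upper, frequency)` and the function name `grouped_md_from_mean` are assumptions made for this illustration.

```python
def grouped_md_from_mean(classes):
    """Mean deviation from the mean for a continuous frequency distribution.

    classes: list of (lower_limit, upper_limit, frequency).
    Class midpoints stand in for the individual values.
    """
    mids = [(lo + hi) / 2 for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, mids)) / n             # step (i)
    sum_f_abs_d = sum(f * abs(m - mean) for f, m in zip(freqs, mids))  # steps (ii)-(iii)
    return mean, sum_f_abs_d / n                                   # step (iv)

profits = [(10, 20, 5), (20, 30, 8), (30, 50, 16), (50, 70, 8), (70, 80, 3)]
print(grouped_md_from_mean(profits))  # (40.5, 12.975)
```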


Mean Deviation from Median 


TABLE 6.3 

Class intervals Frequencies 
20-30 5 
30-40 10 
40-60 20 
60-80 9 
80-90 6 

50 


The procedure to calculate mean 
deviation from the median is the same 
as it is in case of M.D. from mean, 
except that deviations are to be taken 
from the median as given below: 





Example 7

C.I.      f     m.p.    |d|     f|d|
20-30      5    25      25      125
30-40     10    35      15      150
40-60     20    50       0        0
60-80      9    70      20      180
80-90      6    85      35      210
          50                    665

$$M.D._{(Median)} = \frac{\sum f|d|}{\sum f} = \frac{665}{50} = 13.3$$
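For the distribution in Table 6.3 the median itself has to be found first. The sketch below interpolates the median within its class and then averages the absolute deviations of the midpoints from it. The helper names are hypothetical.

```python
def grouped_median(classes):
    """Median of a continuous series by interpolation at the n/2-th value."""
    n = sum(f for _, _, f in classes)
    cf = 0  # cumulative frequency before the current class
    for lo, hi, f in classes:
        if cf + f >= n / 2:
            return lo + (n / 2 - cf) / f * (hi - lo)
        cf += f

def grouped_md_from_median(classes):
    """Mean deviation of the class midpoints from the interpolated median."""
    n = sum(f for _, _, f in classes)
    med = grouped_median(classes)
    return sum(f * abs((lo + hi) / 2 - med) for lo, hi, f in classes) / n

table_6_3 = [(20, 30, 5), (30, 40, 10), (40, 60, 20), (60, 80, 9), (80, 90, 6)]
print(grouped_median(table_6_3), grouped_md_from_median(table_6_3))  # 50.0 13.3
```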





Mean Deviation: Comments 


Mean deviation is based on all 
values. A change in even one value 
will affect it. Mean deviation is the 
least when calculated from the 
median i.e., it will be higher if 
calculated from the mean. However 
it ignores the signs of deviations
and cannot be calculated for open- 
ended distributions. 





Standard Deviation 


Standard Deviation is the positive 
square root of the mean of squared 
deviations from mean. So if there are 
five values $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$, first their
mean is calculated. Then deviations of 
the values from mean are calculated. 
These deviations are then squared. The 
mean of these squared deviations is the 
variance. Positive square root of the 
variance is the standard deviation. 

(Note that standard deviation is 
calculated on the basis of the mean only). 


Calculation of Standard Deviation 
for ungrouped data 


Four alternative methods are available 
for the calculation of standard 
deviation of individual values. All these 
methods result in the same value of 
standard deviation. These are: 


(i) Actual Mean Method 

(ii) Assumed Mean Method 
(iii) Direct Method 

(iv) Step-Deviation Method 


Actual Mean Method: 


Suppose you have to calculate the 
standard deviation of the following 
values: 

5, 10, 25, 30, 50 
The first step is to calculate the mean:

$$\bar{x} = \frac{5+10+25+30+50}{5} = \frac{120}{5} = 24$$
Example 8

x      d = (x - x̄)     d²
5      -19             361
10     -14             196
25      +1               1
30      +6              36
50     +26             676
         0            1270


Then the following formula is used: 


$$\sigma = \sqrt{\frac{\sum d^2}{n}} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$

$$\sigma = \sqrt{\frac{1270}{5}} = \sqrt{254} = 15.937$$


Do you notice the value from which 
deviations have been calculated in the 
above example? Is it the Actual Mean? 
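The Actual Mean Method translates almost line by line into Python. This sketch divides by n, as the chapter does (a population standard deviation); the function name `std_dev` is illustrative.

```python
from math import sqrt

def std_dev(values):
    """Positive square root of the mean of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n  # divides by n, not n - 1
    return sqrt(variance)

print(round(std_dev([5, 10, 25, 30, 50]), 3))  # 15.937
```

Python's standard library offers `statistics.pstdev`, which uses the same divisor n and should give the same value.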


Assumed Mean Method 


For the same values, deviations may be 
calculated from any arbitrary value 
Ax such that d =X-Ax. Taking Ax 
= 25, the computation of the standard 
deviation is shown below: 


Example 9

x      d = (x - Ax)    d²
5      -20             400
10     -15             225
25       0               0
30      +5              25
50     +25             625
        -5            1275
Formula for Standard Deviation

$$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2}$$

$$\sigma = \sqrt{\frac{1275}{5} - \left(\frac{-5}{5}\right)^2} = \sqrt{254} = 15.937$$


Note that the sum of deviations 
from a value other than actual 
mean will not be equal to zero. 

Standard deviation is not affected 
by the value of the constant from 


which deviations are calculated. 
The value of the constant does not 
figure in the standard deviation 
formula. Thus, Standard deviation 
is Independent of Origin. 
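Independence of origin can be checked numerically by repeating the assumed-mean calculation with several different constants. The function name `std_dev_assumed_mean` is hypothetical, and the constants other than 25 are chosen here purely for illustration.

```python
from math import sqrt

def std_dev_assumed_mean(values, assumed):
    """S.D. from deviations d = x - assumed, corrected by (sum of d / n) squared."""
    n = len(values)
    d = [x - assumed for x in values]
    return sqrt(sum(di ** 2 for di in d) / n - (sum(d) / n) ** 2)

values = [5, 10, 25, 30, 50]
for assumed in (25, 24, 0, 100):
    print(assumed, round(std_dev_assumed_mean(values, assumed), 3))
```

Every choice prints 15.937. Taking the constant as 0 amounts to working with the raw values, and taking it as 24 (the actual mean) makes the correction term vanish.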





Direct Method 


Standard Deviation can also be 
calculated from the values directly, i.e., 
without taking deviations, as shown 
below: 


Example 10

x       x²
5         25
10       100
25       625
30       900
50      2500
120     4150


(This amounts to taking deviations
from zero.)
The following formula is used:

$$\sigma = \sqrt{\frac{\sum x^2}{n} - (\bar{x})^2}$$

or $\sigma = \sqrt{\frac{4150}{5} - (24)^2} = \sqrt{254} = 15.937$


Step-deviation Method 


If the values are divisible by a common 
factor, they can be so divided and 
standard deviation can be calculated 
from the resultant values as follows: 


Example 11 


Since all the five values are divisible by 
a common factor 5, we divide and get 
the following values: 


x      x'     d' = (x' - x̄')    d'²
5       1     -3.8              14.44
10      2     -2.8               7.84
25      5     +0.2               0.04
30      6     +1.2               1.44
50     10     +5.2              27.04
                 0              50.80

In the above table,

$$x' = \frac{x}{c}$$

where c = common factor.

The first step is to calculate

$$\bar{x}' = \frac{1+2+5+6+10}{5} = \frac{24}{5} = 4.8$$

The following formula is used to
calculate standard deviation:

$$\sigma = \sqrt{\frac{\sum d'^2}{n}} \times c$$

Substituting the values,

$$\sigma = \sqrt{\frac{50.80}{5}} \times 5$$

$$\sigma = \sqrt{10.16} \times 5$$

$$\sigma = 15.937$$


Alternatively, instead of dividing the 
values by a common factor, the 
deviations can be calculated and then 
divided by a common factor. 


Standard deviation can be 
calculated as shown below: 


Example 12 

x      d = (x - 25)    d' = (d/5)    d'²
5      -20             -4            16
10     -15             -3             9
25       0              0             0
30      +5             +1             1
50     +25             +5            25
                       -1            51


Deviations have been calculated 
from an arbitrary value 25. Common 
factor of 5 has been used to divide 
deviations. 


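$$\sigma = \sqrt{\frac{\sum d'^2}{n} - \left(\frac{\sum d'}{n}\right)^2} \times c$$

$$\sigma = \sqrt{\frac{51}{5} - \left(\frac{-1}{5}\right)^2} \times 5$$

$$\sigma = \sqrt{10.16} \times 5 = 15.937$$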





Standard deviation is not independent 
of scale. Thus, if the values or 


deviations are divided by a common 


factor, the value of the common factor 
is used in the formula to get the value 
of standard deviation. 
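The dependence on scale can be seen by dividing the values by the common factor and comparing the two standard deviations. A minimal sketch, reusing an illustrative `std_dev` helper that divides by n:

```python
from math import sqrt

def std_dev(values):
    """Standard deviation with divisor n, as used throughout the chapter."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

values = [5, 10, 25, 30, 50]
c = 5                                  # common factor
scaled = [x / c for x in values]       # 1, 2, 5, 6, 10

sd_scaled = std_dev(scaled)
print(round(sd_scaled * c, 3), round(std_dev(values), 3))  # 15.937 15.937
```

The standard deviation of the divided values is smaller by the factor c, which is why the formula multiplies by c at the end.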
Standard Deviation in Continuous
frequency distribution:


Like ungrouped data, S.D. can be 
calculated for grouped data by any of 
the following methods: 

(i) Actual Mean Method

(ii) Assumed Mean Method 

(iii) Step-Deviation Method 


Actual Mean Method 


For the values in Table 6.2, Standard 
Deviation can be calculated as follows: 


Example 13 


(1)       (2)    (3)    (4)     (5)      (6)       (7)
CI        f      m      fm      d        fd        fd²

10-20      5     15      75     -25.5    -127.5     3251.25
20-30      8     25     200     -15.5    -124.0     1922.00
30-50     16     40     640      -0.5      -8.0        4.00
50-70      8     60     480     +19.5    +156.0     3042.00
70-80      3     75     225     +34.5    +103.5     3570.75

          40           1620                 0      11790.00


Following steps are required: 
1. Calculate the mean of the
distribution.

$$\bar{x} = \frac{\sum fm}{\sum f} = \frac{1620}{40} = 40.5$$

2. Calculate deviations of mid-values
from the mean so that $d = m - \bar{x}$
(Col. 5).

3. Multiply the deviations with their
corresponding frequencies to get
'fd' values (Col. 6). [Note that $\sum fd = 0$.]

4. Calculate 'fd²' values by
multiplying 'fd' values with 'd'
values (Col. 7). Sum up these to
get $\sum fd^2$.

5. Apply the formula as under:

$$\sigma = \sqrt{\frac{\sum fd^2}{n}} = \sqrt{\frac{11790}{40}} = 17.168$$

Assumed Mean Method 








For the values in example 13, standard 
deviation can be calculated by taking 
deviations from an assumed mean (say 
40) as follows: 


Example 14 
(1)       (2)    (3)    (4)     (5)     (6)
CI        f      m      d       fd      fd²
10-20      5     15     -25     -125     3125
20-30      8     25     -15     -120     1800
30-50     16     40       0        0        0
50-70      8     60     +20     +160     3200
70-80      3     75     +35     +105     3675
          40                     +20    11800


The following steps are required: 

1. Calculate mid-points of classes 
(Col. 3) 

2. Calculate deviations of mid-points
from an assumed mean such that
d = m - Ax (Col. 4), where Ax is the
assumed mean. Here the assumed mean is 40.

3. Multiply values of ‘d’ with 
corresponding frequencies to get 
‘fd’ values (Col. 5). (Note that the 
total of this column is not zero since 
deviations have been taken from 
assumed mean). 

4. Multiply 'fd' values (Col. 5) with 'd'
values (Col. 4) to get fd² values (Col.
6). Find $\sum fd^2$.

5. Standard Deviation can be
calculated by the following formula:

$$\sigma = \sqrt{\frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2}$$

or $\sigma = \sqrt{\frac{11800}{40} - \left(\frac{20}{40}\right)^2}$

or $\sigma = \sqrt{294.75} = 17.168$


Step-deviation Method 


In case the values of deviations are 
divisible by a common factor, the 
calculations can be simplified by the 
step-deviation method as in the 
following example. 


Example 15 


(1)       (2)    (3)    (4)     (5)    (6)     (7)
CI        f      m      d       d'     fd'     fd'²

10-20      5     15     -25     -5     -25     125
20-30      8     25     -15     -3     -24      72
30-50     16     40       0      0       0       0
50-70      8     60     +20     +4     +32     128
70-80      3     75     +35     +7     +21     147

          40                            +4     472


Steps required: 


1. Calculate class mid-points (Col. 3) 
and deviations from an arbitrarily 
chosen value, just like in the 
assumed mean method. In this 
example, deviations have been 
taken from the value 40. (Col. 4) 


2. Divide the deviations by a common
factor denoted as 'c'. c = 5 in the
above example. The values so
obtained are d' values (Col. 5).

3. Multiply d' values with
corresponding 'f' values (Col. 2) to
obtain fd' values (Col. 6).

4. Multiply fd' values with d' values
to get fd'² values (Col. 7).

5. Sum up values in Col. 6 and Col. 7
to get $\sum fd'$ and $\sum fd'^2$ values.

6. Apply the following formula:

$$\sigma = \sqrt{\frac{\sum fd'^2}{\sum f} - \left(\frac{\sum fd'}{\sum f}\right)^2} \times c$$

or $\sigma = \sqrt{\frac{472}{40} - \left(\frac{4}{40}\right)^2} \times 5$

or $\sigma = \sqrt{11.8 - 0.01} \times 5$

or $\sigma = \sqrt{11.79} \times 5$

$\sigma = 17.168$
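The step-deviation calculation can be sketched with the arbitrary value and the common factor as parameters. Setting c = 1 reduces it to the assumed mean method. The function name `grouped_std_dev_step` and its parameter names are hypothetical.

```python
from math import sqrt

def grouped_std_dev_step(classes, assumed, c):
    """S.D. of grouped data from step deviations d' = (m - assumed) / c."""
    n = sum(f for _, _, f in classes)
    d_prime = [((lo + hi) / 2 - assumed) / c for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]
    sum_fd = sum(f * d for f, d in zip(freqs, d_prime))        # Col. 6 total
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, d_prime))   # Col. 7 total
    return sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * c

profits = [(10, 20, 5), (20, 30, 8), (30, 50, 16), (50, 70, 8), (70, 80, 3)]
print(round(grouped_std_dev_step(profits, assumed=40, c=5), 3))  # 17.168
print(round(grouped_std_dev_step(profits, assumed=40, c=1), 3))  # 17.168
```

Both calls agree with the Actual Mean Method, which is the sense in which all the methods give the same standard deviation.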


Standard Deviation: Comments 


Standard Deviation, the most widely 
used measure of dispersion, is 
based on all values. Therefore a 


change in even one value affects 
the value of standard deviation. It 
is independent of origin but not of 


scale. It is also useful in certain 
advanced statistical problems. 


4. ABSOLUTE AND RELATIVE MEASURES
OF DISPERSION


All the measures, described so far, are 
absolute measures of dispersion. They 
calculate a value which, at times, is 
difficult to interpret. For example, 
consider the following two data sets: 


Set A 500 700 1000
Set B 1,00,000 1,20,000 1,30,000 


Suppose the values in Set A are the 
daily sales recorded by an ice-cream 
vendor, while Set B has the daily sales 
of a big departmental store. Range for 
Set A is 500 whereas for Set B, it is 
30,000. The value of Range is much
higher in Set B. Can you say that the
variation in sales is higher for the
departmental store? It can be easily
observed that the highest value in Set
A is double the smallest value, whereas
for Set B, it is only 30% higher.
Thus, absolute measures may give
misleading ideas about the extent of
variation, especially when the averages
differ significantly.

Another weakness of absolute 
measures is that they give the answer 
in the units in which original values are 
expressed. Consequently, if the values 
are expressed in kilometers, the 
dispersion will also be in kilometers. 
However, if the same values are
expressed in meters, an absolute
measure will give the answer in meters
and the value of dispersion will appear
to be 1000 times as large.

To overcome these problems, 
relative measures of dispersion can be 
used. Each absolute measure has a 
relative counterpart. Thus, for range, 
there is coefficient of range which is 
calculated as follows: 


$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$

where L = Largest value
      S = Smallest value


Similarly, for Quartile Deviation, it 
is Coefficient of Quartile Deviation 
which can be calculated as follows: 

$$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

where $Q_3$ = 3rd Quartile
      $Q_1$ = 1st Quartile
For Mean Deviation, it is Coefficient
of Mean Deviation.

$$\text{Coefficient of Mean Deviation} = \frac{M.D._{(\bar{x})}}{\bar{x}} \quad \text{or} \quad \frac{M.D._{(Median)}}{Median}$$

Thus, if Mean Deviation is 
calculated on the basis of the Mean, it 
is divided by the Mean. If Median is 
used to calculate Mean Deviation, it is 
divided by the Median. 

For Standard Deviation, the relative 
measure is called Coefficient of 
Variation, calculated as below: 

$$\text{Coefficient of Variation} = \frac{\text{Standard Deviation}}{\text{Arithmetic Mean}} \times 100$$

It is usually expressed in
percentage terms and is the most
commonly used relative measure of
dispersion. Since relative measures are
free from the units in which the values
have been expressed, they can be
compared even across different groups
having different units of measurement.
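The two relative measures that need no quartiles can be sketched for Set A and Set B as follows. The function names are illustrative, and the standard deviation again uses the divisor n.

```python
from math import sqrt

def coefficient_of_range(values):
    """(L - S) / (L + S) for the largest value L and smallest value S."""
    return (max(values) - min(values)) / (max(values) + min(values))

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sd / mean * 100

set_a = [500, 700, 1000]            # ice-cream vendor
set_b = [100000, 120000, 130000]    # departmental store

for name, data in [("Set A", set_a), ("Set B", set_b)]:
    print(name, round(coefficient_of_range(data), 3),
          round(coefficient_of_variation(data), 2))
```

Although Set B has the far larger range in absolute terms, both relative measures come out higher for Set A. Multiplying every value by 1000 (kilometres to metres, say) leaves both coefficients unchanged.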


5. LORENZ CURVE


The measures of dispersion discussed 
so far give a numerical value of 
dispersion. A graphical measure called 
Lorenz Curve is available for estimating 
inequalities in distribution. You may 
have heard of statements like ‘top 10% 
of the people of a country earn 50% of 
the national income while top 20% 
account for 80%’. An idea about 
income disparities is given by such 
figures. Lorenz Curve uses the 
information expressed in a cumulative 
manner to indicate the degree of 
inequality. For example, Lorenz Curve
of income gives a relationship between
percentage of population and its share
of income in total income. It is especially
useful in comparing the variability of two
or more distributions by drawing two
or more Lorenz curves on the same axis.


Construction of the Lorenz curve 


Following steps are required. 


1. Calculate class midpoints to obtain
Col. (2) of Table 6.4.

2. Calculate the estimated total income
of employees in each class by
multiplying the midpoint of the
class by the frequency in the class.
Thus obtain Col. (4) of Table 6.4.

3. Express frequency in each class as
a percentage (%) of total frequency.
Thus obtain Col. (5) of Table 6.4.

4. Express total income of each class
as a percentage (%) of the grand
total income of all classes together.
Thus obtain Col. (6) of Table 6.4.

5. Prepare the 'less than' cumulative
frequency and cumulative income
Table 6.5.

6. Col. (2) of Table 6.5 shows the
cumulative frequency of employees.
Col. (3) of Table 6.5 shows the
cumulative income going to these
persons.

7. Draw a line joining co-ordinate
(0,0) with (100,100). This is called
the line of equal distribution, shown
as line 'OE' in Figure 6.1.

8. Plot the cumulative percentages of
employees on the horizontal axis
and cumulative income on the
vertical axis. We will thus get
the Lorenz curve.


Given below are the monthly incomes of employees of a company: 


TABLE 6.4

Income          Midpoint    Frequency    Total income     % of         % of Total
class           (X)         (f)          of class (fX)    frequency    income
(1)             (2)         (3)          (4)              (5)          (6)
0-5000           2500        5            12500           10            1.29
5000-10000       7500       10            75000           20            7.71
10000-20000     15000       18           270000           36           27.76
20000-40000     30000       10           300000           20           30.85
40000-50000     45000        7           315000           14           32.39
                            50           972500          100
TABLE 6.5

'Less Than' Cumulative Frequency and Income

'Less Than'    Cumulative       Cumulative
(Rs)           frequency (%)    Income (%)
5,000           10                1.29
10,000          30                9.00
20,000          66               36.76
40,000          86               67.61
50,000         100              100.00
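The points to be plotted for the Lorenz curve can be computed directly from the income classes and frequencies. This is a sketch of the tabulation only, not of the drawing; the function name `lorenz_points` is hypothetical.

```python
# (lower limit, upper limit, frequency) for each income class of Table 6.4.
income_classes = [(0, 5000, 5), (5000, 10000, 10), (10000, 20000, 18),
                  (20000, 40000, 10), (40000, 50000, 7)]

def lorenz_points(classes):
    """Return (cumulative % of employees, cumulative % of income) pairs."""
    freqs = [f for _, _, f in classes]
    incomes = [(lo + hi) / 2 * f for lo, hi, f in classes]  # midpoint x frequency
    total_f, total_income = sum(freqs), sum(incomes)
    points = [(0.0, 0.0)]                                   # the curve starts at O
    cum_f = cum_income = 0
    for f, income in zip(freqs, incomes):
        cum_f += f
        cum_income += income
        points.append((100 * cum_f / total_f, 100 * cum_income / total_income))
    return points

for x, y in lorenz_points(income_classes):
    print(round(x, 2), round(y, 2))
```

On the line of equal distribution the two percentages would be equal at every point, so the gap between x and y at each row indicates how far the curve lies from that line.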


Studying the Lorenz Curve 


OE is called the line of equal
distribution, since it would imply a
situation like, top 20% people earn
20% of total income and top 60% earn
60% of the total income. The farther the
curve OABCDE from this line, the
greater is the inequality present in the
distribution. If there are two or more
curves on the same axes, the one which
is the farthest from line OE has the
highest inequality.

[Fig. 6.1: Lorenz Curve and Line of Equal Distribution. Horizontal axis: Cumulative Percentage of Employees; vertical axis: Cumulative Percentage of Incomes.]


6. CONCLUSION


Although Range is the simplest to
calculate and understand, it is unduly
affected by extreme values. Q.D. is not
affected by extreme values as it is based
on only the middle 50% of the data.
However, it is more difficult to interpret.
M.D. and S.D. are both based upon
deviations of values from their average.
M.D. calculates the average of deviations
from the average but ignores the signs of
deviations and therefore appears to be
unmathematical. Standard deviation
attempts to calculate average deviation
from mean. Like M.D., it is based on
all values and is also applied in more
advanced statistical problems. It is the
most widely used measure of
dispersion.


Recap 


• A measure of dispersion improves our understanding about the
behaviour of an economic variable.

• Range and Quartile Deviation are based upon the spread of values.

• M.D. and S.D. are based upon deviations of values from the average.

• Measures of dispersion could be Absolute or Relative.

• Absolute measures give the answer in the units in which data are
expressed.

• Relative measures are free from these units, and consequently
can be used to compare different variables.

• A graphic method, which estimates the dispersion from shape
of a curve, is called Lorenz Curve.
EXERCISES

1. A measure of dispersion is a good supplement to the central value in 
understanding a frequency distribution. Comment. 

2. Which measure of dispersion is the best and how? 

3. Some measures of dispersion depend upon the spread of values whereas 
some are estimated on the basis of the variation of values from a central 
value. Do you agree? 

4. In a town, 25% of the persons earned more than Rs 45,000 whereas 
75% earned more than 18,000. Calculate the absolute and relative values 
of dispersion. 

5. The yield of wheat and rice per acre for 10 districts of a state is as
under:

District    1    2    3    4    5    6    7    8    9    10
Wheat      12   10   15   19   21   16   18    9   25   10
Rice       22   29   12   23   18   15   12   34   18   12
Calculate for each crop, 

(i) Range 

(ii) Q.D. 

(iii) Mean deviation about Mean 

(iv) Mean deviation about Median 

(v) Standard deviation 

(vi) Which crop has greater variation? 

(vii)Compare the values of different measures for each crop. 

6. In the previous question, calculate the relative measures of variation 
and indicate the value which, in your opinion, is more reliable. 

7. A batsman is to be selected for a cricket team. The choice is between X
and Y on the basis of their scores in five previous tests which are:

X    25    85    40    80    120
Y    50    70    65    45     80

Which batsman should be selected if we want, 

(i) a higher run getter, or 

(ii) a more reliable batsman in the team? 


8. To check the quality of two brands of lightbulbs, their life in burning
hours was estimated as under for 100 bulbs of each brand.

Life          No. of bulbs
(in hrs)      Brand A    Brand B
0-50           15          2
50-100         20          8
100-150        18         60
150-200        25         25
200-250        22          5
              100        100
(i) Which brand gives higher life?
(ii) Which brand is more dependable?

9. Average daily wage of 50 workers of a factory was Rs 200 with a standard
deviation of Rs 40. Each worker is given a raise of Rs 20. What is the
new average daily wage and standard deviation? Have the wages become
more or less uniform?

10. If in the previous question, each worker is given a hike of 10% in wages,
how are the mean and standard deviation values affected?

11. Calculate the mean deviation using mean and Standard Deviation for
the following distribution.

Classes       Frequencies
20-40           3
40-80           6
80-100         20
100-120        12
120-140         9
               50

12. The sum of 10 values is 100 and the sum of their squares is 1090. Find
out the coefficient of variation.




